Other limited dependent variable models

Neil Lund

2025-01-13

Count models

Count models can be used to model outcomes that range from 0 to \(\infty\). These are especially common in international relations, where they cover things like:

  • Number of battle deaths in a conflict

  • Number of terror attacks in a country

  • Number of protests in a year

  • Number of years between coups (although this may be classified as a survival model)

Poisson Distribution

The simplest way to model a count is with a Poisson distribution.

  • A Poisson distribution has a single parameter, \(\lambda\), which represents the average rate of occurrence over a fixed time period.

  • As with logit models, this is non-linear, and estimating it requires maximum likelihood methods.
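The maximum likelihood idea can be sketched directly: write down the Poisson log-likelihood and maximize it numerically. This is a minimal illustration on simulated data with made-up true coefficients (0.5 and 0.8), not the model fit used later in these notes:

```r
# Simulate Poisson counts from known (assumed) coefficients
set.seed(1)
x <- rnorm(500)
y <- rpois(500, lambda = exp(0.5 + 0.8 * x))

# Negative Poisson log-likelihood (the -log(y!) constant is dropped
# because it does not depend on the coefficients)
neg_loglik <- function(beta) {
  eta <- beta[1] + beta[2] * x
  -sum(y * eta - exp(eta))
}

fit <- optim(c(0, 0), neg_loglik)
round(fit$par, 2)  # close to the true values 0.5 and 0.8
```

Running `glm(y ~ x, family = "poisson")` on the same data recovers essentially identical estimates, since glm maximizes the same likelihood.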

[Figure: Poisson probability mass functions for several values of \(\lambda\). By Skbkekas - Own work, CC BY 3.0, https://commons.wikimedia.org/w/index.php?curid=9447142]


Poisson Distribution

The expected value of \(Y|X\) is

\[ E(Y|X) = e^{b_0 + b_1X_1 + b_2X_2 + \dots} \]

poisson_func <- function(x, beta0, beta1) {
  exp(beta0 + beta1 * x)
}
curve(poisson_func(x, 0, 1), xlim = c(-1, 5), lwd = 2, ylab = "E(Y|X)", las = 1)
curve(poisson_func(x, -2, 1), xlim = c(-1, 5), lwd = 2, lty = 2, col = "red", add = TRUE)
curve(poisson_func(x, 2, 1), xlim = c(-1, 5), lwd = 2, lty = 3, col = "blue", add = TRUE)

legend(-1, 150, c(expression(paste(beta[0], " = 0")),
                  expression(paste(beta[0], " = -2")),
                  expression(paste(beta[0], " = 2"))),
       lwd = 2, lty = 1:3, col = c("black", "red", "blue"))

Poisson Distribution

poisson_func <- function(x, beta0, beta1) {
  exp(beta0 + beta1 * x)
}
curve(poisson_func(x, 3, 0), xlim = c(-1, 5), lwd = 2, ylab = "E(Y|X)", las = 1, col = "blue")
curve(poisson_func(x, 3, -2), xlim = c(-1, 5), lwd = 2, lty = 2, add = TRUE)
curve(poisson_func(x, 3, 2), xlim = c(-1, 5), lwd = 2, lty = 3, col = "red", add = TRUE)

legend(3, 150, c(expression(paste(beta[1], " = 0")),
                 expression(paste(beta[1], " = -2")),
                 expression(paste(beta[1], " = 2"))),
       lwd = 2, lty = 1:3, col = c("blue", "black", "red"))

Poisson Overdispersion

The Poisson distribution assumes that the variance is equal to the mean:

set.seed(100)
x<-rpois(4000, 25)
var(x)
[1] 24.87314
mean(x)
[1] 25.01825

Poisson Overdispersion

The Poisson distribution assumes that the variance is equal to the mean:

But this is rarely true with real-world data:

  # deaths in terrorist attacks per country-year
mean(gtd_counts$nkill)
[1] 2.238636
var(gtd_counts$nkill)
[1] 63.0513
library(ggplot2)
library(ggdist)          # stat_slab
library(distributional)  # dist_poisson

data.frame(dist = dist_poisson(lambda = mean(gtd_counts$nkill))) |>
  ggplot(aes(xdist = dist)) +
  stat_slab(alpha = .7) +
  stat_slab(data = gtd_counts, aes(x = nkill),
            fill = 'lightblue',
            inherit.aes = FALSE) +
  theme_bw() +
  labs(title = 'Deaths in terrorist attacks in 2018 compared to a Poisson distribution with the same mean')

Poisson Overdispersion

  • Models with more variance than expected are called “overdispersed”, and overdispersion causes us to underestimate our standard errors.

    • This can also happen with other distributions, but it’s probably less common because most distributions allow the variance to differ from the mean.
  • This may be the result of different data generating processes, e.g. one pattern for states with terrorist insurgencies and another for states without.

    • In some cases there may be a way to fix this with a different set of parameters, but there’s no guarantee.

Options for handling overdispersion

  • Quasi-Poisson models use a quasi-likelihood that lets the variance be a multiple of the mean, which can account for overdispersion.

  • A random effects model with one random effect per observation

  • Hurdle or zero-inflated models (in special cases)

  • A negative binomial model
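The quasi-Poisson option can be sketched on simulated data. Here the counts are drawn from a negative binomial (so they are overdispersed by construction); everything in this block is made up for illustration:

```r
# Simulate counts whose variance exceeds their mean
set.seed(2)
x <- rnorm(300)
y <- rnbinom(300, size = 1, mu = exp(1 + 0.5 * x))

# family = 'quasipoisson' estimates the dispersion instead of fixing it at 1
qp <- glm(y ~ x, family = "quasipoisson")
summary(qp)$dispersion  # well above 1: plain Poisson SEs would be too small
```

The quasi-Poisson point estimates match the plain Poisson ones; only the standard errors are inflated by the estimated dispersion.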

Negative Binomial Model

  • The negative binomial distribution can be thought of as the number of failures until a specified number of successes. Unlike the Poisson distribution, it has a separate dispersion parameter, so its variance can exceed its mean and it can account for overdispersion:
library(ggplot2)
library(ggdist)          # stat_slab
library(distributional)  # dist_negative_binomial
library(tibble)

tibble('a' = dist_negative_binomial(size = 1,   prob = .5),
       'b' = dist_negative_binomial(size = 10,  prob = .5),
       'c' = dist_negative_binomial(size = 100, prob = .5)
       ) |>
  ggplot() +
  stat_slab(aes(xdist = a), fill = 'lightblue',  alpha = .7) +
  stat_slab(aes(xdist = b), fill = 'lightgreen', alpha = .7) +
  stat_slab(aes(xdist = c), fill = 'orange',     alpha = .7) +
  theme_bw()
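The dispersion claim is easy to verify by simulation: in the mean/dispersion parameterization, a negative binomial has variance \(\mu + \mu^2/\text{size}\), which is always larger than the mean. The values below are arbitrary:

```r
# Negative binomial: var = mu + mu^2/size > mean
set.seed(3)
draws <- rnbinom(1e5, size = 2, mu = 5)
mean(draws)  # about 5
var(draws)   # about 5 + 5^2/2 = 17.5
```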

Poisson vs. negative binomial

library(MASS)  # glm.nb

model <- glm(nkill ~ v2x_libdem + any_attacks + e_gdppc,
             data = gtd_counts, family = 'poisson')
nb_model <- glm.nb(nkill ~ v2x_libdem + any_attacks + e_gdppc,
                   data = gtd_counts)
                            poisson             negative binomial
liberal democracy score     -1.718              -0.781
                            [-2.166, -1.267]    [-3.496, 1.947]
gdp per capita               0.031               0.007
                            [0.024, 0.038]      [-0.034, 0.051]
any attacks in prior year    2.888               2.544
                            [2.448, 3.386]      [1.358, 3.723]
Num.Obs.                     176                 176
AIC                          1525.1              413.0
BIC                          1537.8              428.8
Log.Lik.                     -758.548            -201.481
F                            63.204              7.386
RMSE                         7.60                7.69

95% CI in brackets

Testing Overdispersion

The performance package has a function to check a count model for overdispersion. The null hypothesis here is no overdispersion:

library(performance)
check_overdispersion(model)
# Overdispersion test

       dispersion ratio =   18.614
  Pearson's Chi-Squared = 3201.581
                p-value =  < 0.001

We can also see this is addressed by the negative binomial model:

check_overdispersion(nb_model)
# Overdispersion test

 dispersion ratio = 0.689
          p-value = 0.864

Model fit

Since the only difference here is the addition of a single dispersion parameter, we can treat these two models as nested and use a likelihood ratio test to determine whether the negative binomial model is a better fit. Unsurprisingly, it is:

pchisq(2 * (logLik(nb_model) - logLik(model)), df = 1, lower.tail = FALSE)
'log Lik.' 2.798821e-244 (df=5)

Poisson Prediction and interpretation

The negative binomial model and the poisson model both have the same inverse link function, so generating predictions here is the same:

  • Calculate the linear prediction (plug in the values and multiply each by its coefficient)

  • Exponentiate the result

    coefs<-coef(nb_model)
    exp(coefs['(Intercept)'] + 
          coefs['v2x_libdem'] * 0.74 + 
          coefs["any_attacksyes"] * 1 +
          coefs['e_gdppc'] * 59.6
    
        )
    (Intercept) 
       3.581757 

Count models are still nonlinear! So, as with logits, the marginal effect of X will depend in part on the values of the other covariates.
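To see why, differentiate the expected count: \(\partial E(Y|X)/\partial x_1 = b_1 e^{b_0 + b_1 x_1 + b_2 x_2}\), so the same coefficient produces different marginal effects at different covariate values. A tiny sketch with hypothetical coefficients:

```r
# Hypothetical coefficients, just for illustration
b0 <- 0.5; b1 <- 0.3; b2 <- 0.2

# d E(Y|X) / d x1 = b1 * exp(b0 + b1*x1 + b2*x2)
me_x1 <- function(x1, x2) b1 * exp(b0 + b1 * x1 + b2 * x2)

me_x1(1, 0)
me_x1(1, 5)  # same slope coefficient, larger marginal effect at higher x2
```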

Incidence Rates

Exponentiated coefficients can also be interpreted as the effect of a one unit increase in X on the rate ratio of Y, so an exponentiated coefficient of 2 means the rate doubles, 0.5 means it is cut in half, and 1 means it stays the same.
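A quick arithmetic check of the rate-ratio interpretation, with an arbitrary intercept and a coefficient chosen so its exponent is 2:

```r
b1 <- log(2)                # a coefficient whose exponent is 2
rate_x3 <- exp(1 + b1 * 3)  # expected count at x = 3 (intercept of 1 assumed)
rate_x4 <- exp(1 + b1 * 4)  # expected count at x = 4
rate_x4 / rate_x3           # exp(b1) = 2: the rate doubles, regardless of x
```

Unlike the marginal effect on the count scale, the rate ratio is constant across covariate values, which is why exponentiated coefficients are a popular summary.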

# model_list and map (coefficient labels) were defined for the earlier table
modelsummary(model_list,
             coef_map  = map,
             coef_omit = "Intercept",
             conf_level = .95,
             notes = "95% CI in brackets. Exponentiated coefficients",
             exponentiate = TRUE
             )
                            poisson     negative binomial
liberal democracy score     0.179       0.458
                            (0.041)     (0.536)
gdp per capita              1.032       1.007
                            (0.003)     (0.018)
any attacks in prior year   17.953      12.734
                            (4.275)     (6.994)
Num.Obs.                    176         176
AIC                         1525.1      413.0
BIC                         1537.8      428.8
Log.Lik.                    -758.548    -201.481
F                           63.204      7.386
RMSE                        7.60        7.69

95% CI in brackets. Exponentiated coefficients

Predictions and Marginal Effects

library(ggeffects)

predict_response(nb_model, terms = c('e_gdppc',
                                     'any_attacks'), margin = 'empirical') |>
  plot(show_data = TRUE, jitter = .3) +
  labs(title = "Predicted deaths in terrorist incidents",
       y = "number killed"
       )

Predictions and Marginal Effects

predict_response(nb_model,
                 terms = c('any_attacks'),
                 margin = 'empirical') |>
  plot() +
  labs(title = "Predicted deaths in terrorist incidents",
       y = "number killed")

Predictions and Marginal Effects

library(marginaleffects)
mfx<-avg_comparisons(nb_model)

mfx

        Term Contrast Estimate Std. Error      z Pr(>|z|)   S   2.5 % 97.5 %
 any_attacks yes - no   3.3055     1.1976  2.760  0.00578 7.4  0.9583 5.6526
 e_gdppc     +1         0.0143     0.0388  0.370  0.71163 0.5 -0.0616 0.0903
 v2x_libdem  +1        -1.1807     1.2650 -0.933  0.35066 1.5 -3.6601 1.2987

Type:  response 
Columns: term, contrast, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high 

Using a Random Effect

library(lme4)


mmodel <- glmer(nkill ~
                  v2x_libdem +
                  any_attacks +
                  e_gdppc +
                  (1 | country_txt), data = gtd_counts, family = 'poisson')

performance::check_overdispersion(mmodel)
# Overdispersion test

       dispersion ratio = 0.012
  Pearson's Chi-Squared = 2.001
                p-value =     1

Considerations

  • For count data, a Poisson model is a reasonable starting point, but overdispersion is so common in real-world data that you should never assume it away, and you should be a bit skeptical if you see a Poisson model with no discussion of the problem.

  • When in doubt, use a negative binomial model. If there’s no overdispersion, you should get roughly the same results as a Poisson.
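The "roughly the same results" claim is easy to check by simulation: fit both models to data generated from a true Poisson. All values here are made up:

```r
library(MASS)  # glm.nb

# Simulated data with no overdispersion (a true Poisson process)
set.seed(4)
x <- rnorm(1000)
y <- rpois(1000, exp(0.2 + 0.5 * x))

coef(glm(y ~ x, family = "poisson"))
coef(glm.nb(y ~ x))  # essentially identical; the estimated theta is very large
```

A very large estimated theta (dispersion) is the negative binomial's way of collapsing back toward the Poisson.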

Other models

  • Ordered logit: response variables with just a few values (for instance, likert responses on a survey)

    • Coefficients are the effect of a one-unit increase in X on the logged odds of moving up one category
  • Multinomial logit: responses for multiple discrete categories

    • K-1 coefficients for each category of the DV.

    • Coefficients are the effect on the logged odds of being in a category relative to the baseline group.

    • Fun fact: the softmax function used here (a generalization of the sigmoid) is also used in neural networks
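The softmax mapping from linear predictors to category probabilities can be computed by hand. The linear predictors below are hypothetical; the baseline category's predictor is fixed at zero, which is why a K-category model needs only K-1 coefficient sets:

```r
# Softmax over three categories; the first is the baseline with eta = 0
eta <- c(0, 1.2, -0.5)            # hypothetical linear predictors
probs <- exp(eta) / sum(exp(eta))
probs                              # category probabilities; always sum to 1
```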